NITI: Training Integer Neural Networks Using Integer-Only Arithmetic

Authors

Abstract

Low-bitwidth integer arithmetic has been widely adopted in hardware implementations of deep neural network inference applications. However, despite the promised energy-efficiency improvements in demanding edge applications, the use of low-bitwidth arithmetic for training remains limited. Unlike inference, training demands high dynamic range and numerical accuracy to achieve quality results, making low-bitwidth training particularly challenging. To address this challenge, we present a novel training framework called NITI that exclusively utilizes integer arithmetic. NITI stores all parameters and accumulates intermediate values as 8-bit integers, while using no more than 5 bits for gradients. To provide the necessary dynamic range during the training process, a per-layer block scaling exponentiation scheme is utilized. By deeply integrating it with the rounding procedures and cross-entropy loss calculation, the proposed scheme incurs only minimal overhead in terms of storage and additional computation. Furthermore, a hardware-efficient pseudo-stochastic rounding scheme, which eliminates the need for external random number generation, is used to facilitate conversion from wider accumulation results to lower-precision storage. Since NITI operates on standard integer storage, it is possible to accelerate it with existing operators originally developed for commodity accelerators. To demonstrate this, an open-source software implementation of NITI that performs end-to-end training with native integer operations on modern GPUs is presented. In addition, experiments have been conducted on an FPGA-based accelerator to evaluate the advantage of NITI. When compared to an equivalent setup implemented with floating-point arithmetic, NITI achieves negligible accuracy degradation on the MNIST and CIFAR10 datasets. On ImageNet, NITI achieves accuracy similar to state-of-the-art integer training frameworks without relying on full-precision floating-point arithmetic in the first and last layers.
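The abstract names two mechanisms for squeezing wide accumulation results back into 8-bit storage: a per-layer block scaling exponent and an RNG-free ("pseudo-stochastic") rounding step. The Python sketch below only illustrates that general idea under stated assumptions; the function name `requantize_int8`, the int32/int64 accumulator widths, and the specific rounding rule that reuses the discarded bits as the randomness source are assumptions for illustration, not the paper's exact algorithm (the authors' open-source implementation should be consulted for the real procedure).

```python
# Illustrative sketch only, not the NITI release: rescale a wide integer
# accumulator to int8 storage plus one shared per-layer block exponent.
import numpy as np

def requantize_int8(acc):
    """Map a wide integer accumulator tensor to (int8 tensor, shared block exponent)."""
    acc = np.asarray(acc, dtype=np.int64)
    max_abs = int(np.abs(acc).max())
    # Smallest right shift that brings every entry into the signed 8-bit range.
    shift = max(0, max_abs.bit_length() - 7)
    if shift == 0:
        return acc.astype(np.int8), 0

    truncated = acc >> shift                  # floor division by 2**shift
    discarded = acc & ((1 << shift) - 1)      # the bits being thrown away

    # RNG-free rounding decision (assumed rule): compare the upper half of the
    # discarded bits against the lower half, so the "randomness" comes from the
    # data itself rather than from an external random number generator.
    half = shift // 2
    if half > 0:
        upper = discarded >> half
        lower = discarded & ((1 << half) - 1)
        round_up = (upper > lower).astype(np.int64)
    else:
        # shift == 1: simply round any nonzero remainder up.
        round_up = (discarded > 0).astype(np.int64)

    out = np.clip(truncated + round_up, -128, 127).astype(np.int8)
    return out, shift
```

In an actual training step, a stage like this would sit after each layer's integer matrix multiply, with the returned block exponent carried alongside the int8 tensor so that downstream layers and the weight update can stay in integer arithmetic.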


Similar articles

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardwar...
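As a rough, non-authoritative illustration of the affine integer representation such integer-only inference schemes build on (a real value x is approximated as scale * (q - zero_point) for an unsigned 8-bit integer q), the Python helpers below show the quantize/dequantize mapping; the function names and the uint8 range are assumptions for this sketch, not code from the paper.

```python
import numpy as np

def quantize(x, scale, zero_point):
    """Map real values to unsigned 8-bit integers: x ~= scale * (q - zero_point)."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    """Recover the approximate real values from their 8-bit representation."""
    return scale * (q.astype(np.int32) - zero_point)
```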


Mixed Precision Training of Convolutional Neural Networks using Integer Operations

The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular FP16 accumulating into FP32 (Micikevicius et al., 2017). On the other hand, while a lot of research has also happened in the domain of low- and mixed-precision integer training, these works either present results for non-SOTA networks (for instance only A...


Channel Smoothing using Integer Arithmetic

This paper presents experiments on using integer arithmetic with the channel representation. Integer arithmetic allows reduction of memory requirements and makes efficient implementations using machine-code vector instructions, integer-only CPUs, or dedicated programmable hardware such as FPGAs possible. We demonstrate the effects of discretisation on a non-iterative robust estimation techniq...


Training Neural Networks with 3–bit Integer Weights

In this work we present neural network training algorithms, which are based on the differential evolution (DE) strategies introduced by Storn and Price [Journal of Global Optimization. 11:341–359, 1997]. These strategies are applied to train neural networks with 3-bit integer weights. Integer-weight neural networks are better suited for hardware implementation than their real-weight analogues. ...
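For readers unfamiliar with differential evolution, the toy Python sketch below shows the generic DE/rand/1/bin update adapted to weight vectors clipped to a signed 3-bit range. Every name and parameter here (de_step, F, CR, the [-4, 3] range) is an assumption for illustration only, not the algorithm from the paper.

```python
import numpy as np

def de_step(pop, fitness, F=0.8, CR=0.9, rng=None):
    """One generation of DE/rand/1/bin over integer weight vectors.

    pop: (num_candidates, num_weights) integer array; fitness maps a vector to a loss.
    """
    rng = rng or np.random.default_rng()
    n, d = pop.shape
    new_pop = pop.copy()
    for i in range(n):
        # Pick three distinct candidates other than i for the mutation step.
        a, b, c = rng.choice([j for j in range(n) if j != i], size=3, replace=False)
        mutant = pop[a] + F * (pop[b] - pop[c])            # differential mutation
        cross = rng.random(d) < CR                         # binomial crossover mask
        trial = np.where(cross, mutant, pop[i])
        trial = np.clip(np.round(trial), -4, 3).astype(pop.dtype)  # signed 3-bit weights
        if fitness(trial) < fitness(pop[i]):               # greedy selection
            new_pop[i] = trial
    return new_pop
```

A complete trainer would decode each candidate vector into a network, use its training loss as the fitness, and iterate de_step until convergence.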


Widening Integer Arithmetic

Some codes require computations to use fewer bits of precision than are normal for the target machine. For example, Java requires 32-bit arithmetic even on a 64-bit target. To run narrow codes on a wide target machine, we present a widening transformation. Almost every narrow operation can be widened by sign- or zero-extending the operands and using a target-machine instruction at its natural wid...
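A small Python analogue of this idea, assumed for illustration rather than taken from the paper: a narrow (here 32-bit) operation is carried out at the host's natural wider width, and the result is then masked and sign-extended back to 32-bit semantics, mirroring the sign/zero-extension recipe described in the abstract.

```python
def sign_extend_32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def add32(a: int, b: int) -> int:
    """32-bit signed addition with wraparound, computed at the host's wider width."""
    return sign_extend_32(a + b)

# INT32_MAX + 1 wraps around to INT32_MIN, as the narrow semantics require.
assert add32(0x7FFFFFFF, 1) == -(1 << 31)
```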



Journal

Journal title: IEEE Transactions on Parallel and Distributed Systems

Year: 2022

ISSN: 1045-9219, 1558-2183, 2161-9883

DOI: https://doi.org/10.1109/tpds.2022.3149787